Normally, you’d want search commands to disregard certain minor differences between the search string you type and the text being searched. For example, sequences of whitespace characters of different length are usually perceived as equivalent; letter-case differences usually don’t matter; etc. This is known as character equivalence.
This section describes the Emacs lax search features, and how to tailor them to your needs.
By default, search commands perform lax space
matching: each space, or sequence of spaces, matches any
sequence of one or more whitespace characters in the text.
(Incremental regexp search has a separate default; see Regexp Search.) Hence,
‘foo bar’ matches ‘foo bar’, ‘foo  bar’,
‘foo   bar’, and so on (but not
‘foobar’). More precisely, Emacs matches
each sequence of space characters in the search string to a
regular expression specified by the variable
search-whitespace-regexp. For example, to make
spaces match sequences of newlines as well as spaces, set it to
‘"[[:space:]\n]+"’. The default value of
this variable depends on the buffer’s major mode; most
major modes classify spaces, tabs, and formfeed characters as
whitespace.
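As a minimal sketch, the setting described above could be placed in an init file as follows. The regexp is the one quoted in the text; using setq in an init file is an assumption about how you manage your configuration.

```elisp
;; Make each run of spaces in the search string match any run of
;; whitespace characters, newlines included.
(setq search-whitespace-regexp "[[:space:]\n]+")
```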
If you want whitespace characters to match exactly, you can
turn lax space matching off by typing M-s SPC
(isearch-toggle-lax-whitespace) within an
incremental search. Another M-s SPC turns lax space matching back on. To
disable lax whitespace matching for all searches, change
search-whitespace-regexp to nil; then
each space in the search string matches exactly one
space.
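To make that permanent rather than toggling it in each search, a sketch of the corresponding init-file line (again assuming you configure Emacs with plain setq forms) might be:

```elisp
;; Disable lax whitespace matching everywhere: each space typed in
;; the search string then matches exactly one space.
(setq search-whitespace-regexp nil)
```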
By default, searches in Emacs ignore the case of the text they are searching through, provided you specify the search string in lower case. Thus, if you search for ‘foo’, then ‘Foo’ and ‘fOO’ also match. Regexps, and in particular character sets, behave likewise: ‘[ab]’ matches ‘a’ or ‘A’ or ‘b’ or ‘B’. This feature is known as case folding, and it is supported in both incremental and non-incremental search modes.
An upper-case letter anywhere in the search string makes the
search case-sensitive. Thus, searching for
‘Foo’ does not find
‘foo’ or ‘FOO’.
This applies to regular expression search as well as to literal
string search. The effect ceases if you delete the upper-case
letter from the search string. The variable
search-upper-case controls this: if it is
non-nil (the default), an upper-case character in
the search string makes the search case-sensitive; setting it to
nil disables this effect of upper-case
characters.
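For instance, if you would rather have upper-case letters in the search string treated like any other letter, a sketch of the setting in an init file (an assumption about where you keep such settings) could be:

```elisp
;; Upper-case letters in the search string no longer force a
;; case-sensitive search; case-fold-search alone decides.
(setq search-upper-case nil)
```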
If you set the variable case-fold-search to
nil, then all letters must match exactly, including
case. This is a per-buffer variable; altering the variable
normally affects only the current buffer, unless you change its
default value. See Locals. This
variable also applies to non-incremental searches, including those
performed by the replace commands (see Replace) and the minibuffer history
matching commands (see Minibuffer
History).
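The difference between the default value and a buffer-local value can be sketched as below. The choice of text-mode-hook is purely illustrative and not something this section prescribes; substitute whichever mode hook suits you.

```elisp
;; Change the default value: this affects every buffer that has not
;; been given its own buffer-local value.
(setq-default case-fold-search nil)

;; Alternatively, make searches case-sensitive only in some buffers,
;; here (as an illustrative assumption) those in Text mode.
(add-hook 'text-mode-hook
          (lambda () (setq-local case-fold-search nil)))
```

You would normally use only one of the two forms, depending on whether you want the change everywhere or just in selected buffers.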
Typing M-c or M-s c
(isearch-toggle-case-fold) within an incremental
search toggles the case sensitivity of that search. The effect
does not extend beyond the current incremental search, but it
does override the effect of adding or removing an upper-case
letter in the current search.
Several related variables control case-sensitivity of
searching and matching for specific commands or activities. For
instance, tags-case-fold-search controls case
sensitivity for find-tag. To find these variables,
do M-x apropos-variable RET
case-fold-search RET.
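As a small sketch of both points, the following sets the tags-specific variable and runs the same apropos query from Lisp. Putting the setq in an init file is an assumption; the apropos call is simply the non-interactive form of the command named above.

```elisp
;; Make tag searches case-sensitive, independently of the value of
;; case-fold-search.
(setq tags-case-fold-search nil)

;; List the variables whose names match "case-fold-search".
(apropos-variable "case-fold-search")
```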
Case folding disregards case distinctions among characters,
making upper-case characters match lower-case variants, and vice
versa. A generalization of case folding is character
folding, which disregards wider classes of distinctions
among similar characters. For instance, under character folding
the letter a matches all of its accented cousins
like ä and
á, i.e., the match disregards the
diacritics that distinguish these variants. In addition,
a matches other characters that resemble it, or have
it as part of their graphical representation, such as
U+249C PARENTHESIZED LATIN SMALL LETTER A and
U+2100 ACCOUNT OF (which looks like a small
a over c). Similarly, the
ASCII double-quote character "
matches all the other variants of double quotes defined by the
Unicode standard. Finally, character folding can make a sequence
of one or more characters match another sequence of a different
length: for example, the sequence of two characters
ff matches U+FB00 LATIN SMALL LIGATURE
FF. Character sequences that are not identical but match
under character folding are known as equivalent character
sequences.
By default, search commands in Emacs do not perform
character folding to match equivalent character
sequences. You can enable this behavior by customizing the
variable search-default-mode to
char-fold-to-regexp. See Search
Customizations. Within an incremental search, typing M-s
' (isearch-toggle-char-fold) toggles character
folding, but only for that search. (Replace commands have a
different default, controlled by a separate option; see Replacement
and Lax Matches.)
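If you prefer to set this from an init file rather than through the Customize interface, a minimal sketch (assuming a plain setq is acceptable for this option in your setup) might be:

```elisp
;; Make searches perform character folding by default, so that
;; equivalent character sequences match.
(setq search-default-mode #'char-fold-to-regexp)
```

With this in place, M-s ' within an incremental search still toggles character folding for that search only.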
As with case folding, typing an explicit variant of a
character, such as ä, as part of the
search string disables character folding for that search. If you
delete such a character from the search string, this effect
ceases.